In this assignment, you will complete the implementation of the NeuralNetwork class, starting with the code included in the 05 and 06 lecture notes. First, define your NeuralNetwork class to include just one hidden layer, as done in notes 05. Follow these steps:
- Define the __init__ function to accept three arguments: the number of inputs in each sample (columns of X), the number of units in the hidden layer, and the number of outputs of the output layer. Save them in self.n_inputs, self.n_hiddens_each_layer, and self.n_outputs, create the weight matrices in self.Ws, and initialize self.rmse_trace to an empty list.
- Define the _forward(self, X) function that returns the output of the network, Y, in standardized form, and creates self.Zs as a list consisting of the input X and the outputs of the hidden layer.
- Define the _gradients(self, X, T) function that returns the gradients of the mean squared error with respect to the weights in each layer.
- Define _calc_rmse_standardized as shown in notes 05.
- Define the train(self, Xtrain, Ttrain, Xtest, Ttest, n_epochs, learning_rate) function that
  - standardizes Xtrain and Ttrain and saves the standardization parameters (means and stds) in the member variables self.X_means, self.X_stds, self.T_means, and self.T_stds,
  - standardizes Xtest and Ttest using self.X_means, self.X_stds, self.T_means, and self.T_stds,
  - loops for n_epochs as shown in notes 05, and in each loop
    - calls the _forward function to calculate the outputs of all units,
    - calls the _gradients function to calculate the gradients of the mean squared error with respect to all weight matrices,
    - updates the weights and appends the train and test RMSE values to self.rmse_trace.
- Define the use(self, X) function that standardizes X using the standardization member variables, calls _forward to calculate the outputs of all units, and returns the network's output converted back to unstandardized form.
- Define the helper function _add_ones, to be called by the functions above.

Remember to name functions with a leading _ if they are not meant to be called by the users of your NeuralNetwork class.

Now test your implementation. You may use the same example data as used in notes 05. When you are happy with your test results:

- Copy your NeuralNetwork class code cell, and paste it after the code cells you used to test your one-hidden-layer NeuralNetwork class.
- Modify the pasted class so it allows any number of hidden layers, including no hidden layers at all, specified as an empty list []. Don't forget this case. The constructor, __init__, must now accept a list of numbers of units in each hidden layer, rather than just a single number of units. The length of this list determines the number of hidden layers. See the following examples for more details.
- Then, apply your NeuralNetwork class to the problem of predicting the value of concrete strength as described below.

import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipd # for display and clear_output
import time
NeuralNetwork Class - One Hidden Layer

# insert your NeuralNetwork class definition here. This will be a large code cell when you are done!
class NeuralNetwork:
pass
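To illustrate the overall shape the steps above produce, here is a minimal, runnable sketch of a one-hidden-layer network with tanh hidden units. It is only an illustration under stated assumptions (the weight-initialization scale, the exact gradient scaling, and a simplified train signature without a test set are all choices made here), not the required solution; your class must follow the step list and signatures given in the instructions.

```python
import numpy as np

class NeuralNetworkSketch:
    """Hedged sketch of a one-hidden-layer regression network; not the required solution."""

    def __init__(self, n_inputs, n_hiddens, n_outputs):
        self.n_inputs, self.n_hiddens, self.n_outputs = n_inputs, n_hiddens, n_outputs
        # One weight matrix per layer; the +1 row holds the bias weights.
        self.Ws = [np.random.uniform(-1, 1, (n_inputs + 1, n_hiddens)) / np.sqrt(n_inputs + 1),
                   np.random.uniform(-1, 1, (n_hiddens + 1, n_outputs)) / np.sqrt(n_hiddens + 1)]
        self.rmse_trace = []

    def _add_ones(self, M):
        return np.hstack((np.ones((M.shape[0], 1)), M))

    def _forward(self, Xst):
        Z = np.tanh(self._add_ones(Xst) @ self.Ws[0])   # hidden layer outputs
        self.Zs = [Xst, Z]
        return self._add_ones(Z) @ self.Ws[1]           # standardized output Y

    def _gradients(self, Xst, Tst):
        Y = self._forward(Xst)
        delta = -(Tst - Y) / (Tst.shape[0] * Tst.shape[1])  # dMSE/dY (up to a factor of 2)
        grad_W1 = self._add_ones(self.Zs[1]).T @ delta
        # Back-propagate through tanh: its derivative is 1 - Z**2.
        delta_hidden = (delta @ self.Ws[1][1:, :].T) * (1 - self.Zs[1] ** 2)
        grad_W0 = self._add_ones(self.Zs[0]).T @ delta_hidden
        return [grad_W0, grad_W1]

    def train(self, X, T, n_epochs, learning_rate):
        # Save standardization parameters, then do plain gradient descent.
        self.X_means, self.X_stds = X.mean(axis=0), X.std(axis=0)
        self.T_means, self.T_stds = T.mean(axis=0), T.std(axis=0)
        Xst = (X - self.X_means) / self.X_stds
        Tst = (T - self.T_means) / self.T_stds
        for _ in range(n_epochs):
            for W, G in zip(self.Ws, self._gradients(Xst, Tst)):
                W -= learning_rate * G
            rmse_st = np.sqrt(np.mean((Tst - self._forward(Xst)) ** 2))
            self.rmse_trace.append(rmse_st * self.T_stds[0])  # unstandardized RMSE
        return self

    def use(self, X):
        Yst = self._forward((X - self.X_means) / self.X_stds)
        return Yst * self.T_stds + self.T_means

# Tiny demonstration on a sine curve.
Xdemo = np.linspace(0, 5, 50).reshape(-1, 1)
Tdemo = np.sin(Xdemo)
nnet_demo = NeuralNetworkSketch(1, 10, 1).train(Xdemo, Tdemo, n_epochs=4000, learning_rate=0.1)
print(f'RMSE after training: {nnet_demo.rmse_trace[-1]:.3f}')
```

The full assignment version additionally standardizes and tracks the test set inside train, as the step list describes.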
In this next code cell, I add a new method to your class that replaces the weights created in your constructor with non-random values to allow you to compare your results with mine, and to allow our grading scripts to work well.
def set_weights_for_testing(self):
    for W in self.Ws[:-1]:  # leave output layer weights at zero
        n_weights = W.shape[0] * W.shape[1]
        W[:] = np.linspace(-0.01, 0.01, n_weights).reshape(W.shape)
        for u in range(W.shape[1]):
            W[:, u] += (u - W.shape[1] / 2) * 0.2
    # Set output layer weights to zero
    self.Ws[-1][:] = 0
    print('Weights set for testing by calling set_weights_for_testing()')

setattr(NeuralNetwork, 'set_weights_for_testing', set_weights_for_testing)
NeuralNetwork Class - Multiple Hidden Layers

When your second version is working, you may delete the above code cell that defines your first version of NeuralNetwork.
class NeuralNetwork:
pass
# If you first develop your `NeuralNetwork` class in a python script file, named `A2mysolution.py`,
# you can import it here for testing.
# Before you check in your notebook, copy and paste the whole `NeuralNetwork` class definition in the
# above cell, and delete this cell.
# from A2mysolution import NeuralNetwork
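The part that changes most in this second version is constructing self.Ws from a list of hidden layer sizes. Here is a small sketch of just that piece; the uniform initialization scale is an assumption, not a requirement.

```python
import numpy as np

# Build one weight matrix per layer from a list of hidden layer sizes.
# An empty list produces a single input-to-output matrix, i.e. a linear model.
def make_weight_matrices(n_inputs, n_hiddens_each_layer, n_outputs):
    shapes = []
    n_in = n_inputs
    for n_units in n_hiddens_each_layer:
        shapes.append((n_in + 1, n_units))   # +1 row for the bias input
        n_in = n_units
    shapes.append((n_in + 1, n_outputs))     # output layer
    return [np.random.uniform(-1, 1, shape) / np.sqrt(shape[0]) for shape in shapes]

print([W.shape for W in make_weight_matrices(1, [3, 2], 1)])  # [(2, 3), (4, 2), (3, 1)]
print([W.shape for W in make_weight_matrices(3, [], 2)])      # [(4, 2)]
```

Note that the shapes for (1, [3, 2], 1) match the shapes of nnet.Ws in the test cells below.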
Here we test your new NeuralNetwork
class that allows 0, 1, 2, or more hidden layers with some simple data.
X = np.arange(0, 10, 0.1).reshape(-1, 1)
T = np.sin(X) + 0.01 * (X ** 2)
X.shape, T.shape
((100, 1), (100, 1))
# Collect every 5th sample as the test set.
test_rows = np.arange(0, X.shape[0], 5)
# All remaining samples are in the train set.
train_rows = np.setdiff1d(np.arange(X.shape[0]), test_rows)
Xtrain = X[train_rows, :]
Ttrain = T[train_rows, :]
Xtest = X[test_rows, :]
Ttest = T[test_rows, :]
print(f'{Xtrain.shape=} {Ttrain.shape=} {Xtest.shape=} {Ttest.shape=}')
Xtrain.shape=(80, 1) Ttrain.shape=(80, 1) Xtest.shape=(20, 1) Ttest.shape=(20, 1)
plt.plot(Xtrain, Ttrain, 'o', label='Train')
plt.plot(Xtest, Ttest, 'o', label='Test')
plt.legend();
n_inputs = X.shape[1]
n_outputs = T.shape[1]
nnet = NeuralNetwork(n_inputs, [3, 2], n_outputs)
nnet
NeuralNetwork(1, [3, 2], 1)
nnet.n_inputs, nnet.n_hiddens_each_layer, nnet.n_outputs
(1, [3, 2], 1)
nnet.rmse_trace
[]
nnet.Ws
[array([[ 0.67704045, 0.70371353, 0.00252495], [ 0.55918031, -0.38161161, -0.05879949]]), array([[-0.3809577 , 0.38959034], [-0.01438485, 0.36708073], [-0.31925238, 0.49403834], [-0.13570525, 0.33183844]]), array([[0.], [0.], [0.]])]
nnet.set_weights_for_testing()
Weights set for testing by calling set_weights_for_testing()
nnet.Ws
[array([[-0.31 , -0.106, 0.098], [-0.298, -0.094, 0.11 ]]), array([[-0.21 , -0.00714286], [-0.20428571, -0.00142857], [-0.19857143, 0.00428571], [-0.19285714, 0.01 ]]), array([[0.], [0.], [0.]])]
nnet.train(Xtrain, Ttrain, Xtest, Ttest, n_epochs=1, learning_rate=0.1)
NeuralNetwork(1, [3, 2], 1)
nnet.Zs
[array([[-1.73291748], [-1.55962573], [-1.38633399], [-1.21304224], [-1.03975049], [-0.86645874], [-0.69316699], [-0.51987524], [-0.3465835 ], [-0.17329175], [ 0. ], [ 0.17329175], [ 0.3465835 ], [ 0.51987524], [ 0.69316699], [ 0.86645874], [ 1.03975049], [ 1.21304224], [ 1.38633399], [ 1.55962573]]), array([[ 2.03527172e-01, 5.68329348e-02, -9.23569751e-02], [ 1.53544458e-01, 4.05825180e-02, -7.34264441e-02], [ 1.02763480e-01, 2.43106038e-02, -5.44428527e-02], [ 5.14411404e-02, 8.02579805e-03, -3.54198229e-02], [-1.54354032e-04, -8.26326587e-03, -1.63710911e-02], [-5.17490267e-02, -2.45479456e-02, 2.68953195e-03], [-1.03068918e-01, -4.08196082e-02, 2.17482009e-02], [-1.53845874e-01, -5.70696481e-02, 4.07910762e-02], [-2.03823074e-01, -7.32895056e-02, 5.98043640e-02], [-2.52760066e-01, -8.94706847e-02, 7.87743562e-02], [-3.00437097e-01, -1.05604771e-01, 9.76874699e-02], [-3.46658600e-01, -1.21683448e-01, 1.16530286e-01], [-3.91255732e-01, -1.37698517e-01, 1.35289586e-01], [-4.34087939e-01, -1.53641907e-01, 1.53952389e-01], [-4.75043566e-01, -1.69505700e-01, 1.72505989e-01], [-5.14039585e-01, -1.85282137e-01, 1.90937983e-01], [-5.51020549e-01, -2.00963638e-01, 2.09236306e-01], [-5.85956907e-01, -2.16542814e-01, 2.27389262e-01], [-6.18842828e-01, -2.32012479e-01, 2.45385547e-01], [-6.49693682e-01, -2.47365664e-01, 2.63214279e-01]]), array([[-0.24026129, -0.00811343], [-0.23101806, -0.00792238], [-0.22158355, -0.00772975], [-0.21200657, -0.007536 ], [-0.20233913, -0.00734163], [-0.19263529, -0.00714712], [-0.18295003, -0.00695296], [-0.17333791, -0.00675965], [-0.16385194, -0.00656764], [-0.15454234, -0.00637739], [-0.14545559, -0.0061893 ], [-0.13663353, -0.00600376], [-0.12811273, -0.0058211 ], [-0.11992404, -0.00564161], [-0.11209237, -0.00546556], [-0.10463666, -0.00529315], [-0.09757005, -0.00512455], [-0.09090019, -0.00495988], [-0.08462964, -0.00479924], [-0.07875642, -0.00464269]]), array([[-0.00035099], [-0.00033749], [-0.00032372], [-0.00030973], 
[-0.00029561], [-0.00028144], [-0.00026729], [-0.00025326], [-0.0002394 ], [-0.00022581], [-0.00021254], [-0.00019965], [-0.00018721], [-0.00017525], [-0.00016381], [-0.00015292], [-0.0001426 ], [-0.00013286], [-0.0001237 ], [-0.00011512]])]
Why only 20 rows in these matrices? I thought I had 80 training samples!
print(nnet)
NeuralNetwork(1, [3, 2], 1) trained for 1 epochs with a final RMSE of None.
nnet.X_means, nnet.X_stds
(array([5.]), array([2.88530761]))
nnet.T_means, nnet.T_stds
(array([0.51792742]), array([0.74017845]))
[Z.shape for Z in nnet.Zs]
[(20, 1), (20, 3), (20, 2), (20, 1)]
nnet.Ws
[array([[-0.31 , -0.106, 0.098], [-0.298, -0.094, 0.11 ]]), array([[-0.21 , -0.00714286], [-0.20428571, -0.00142857], [-0.19857143, 0.00428571], [-0.19285714, 0.01 ]]), array([[5.46221054e-18], [1.45977062e-03], [3.30068017e-05]])]
dir(nnet)
['T_means', 'T_stds', 'Ws', 'X_means', 'X_stds', 'Zs', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_add_ones', '_calc_rmse_standardized', '_forward', '_gradients', 'n_epochs', 'n_hiddens_each_layer', 'n_inputs', 'n_outputs', 'rmse', 'rmse_trace', 'set_weights_for_testing', 'train', 'use']
def plot_data_and_model(nnet, Xtrain, Ttrain, Xtest, Ttest):
    plt.clf()
    plt.subplot(2, 1, 1)
    plt.plot(nnet.rmse_trace)
    plt.xlabel('Epoch')
    plt.ylabel('RMSE')
    plt.legend(('Train RMSE', 'Test RMSE'))
    plt.subplot(2, 1, 2)
    order = np.argsort(Xtrain, axis=0).flatten()
    Xtrain = Xtrain[order]
    Ttrain = Ttrain[order]
    plt.plot(Xtrain, nnet.use(Xtrain), '-', label='Ytrain')
    plt.plot(Xtrain, Ttrain, 'o', label='Ttrain', alpha=0.5)
    order = np.argsort(Xtest, axis=0).flatten()
    Xtest = Xtest[order]
    Ttest = Ttest[order]
    plt.plot(Xtest, nnet.use(Xtest), '-', label='Ytest')
    plt.plot(Xtest, Ttest, 'o', label='Ttest', alpha=0.5)
    plt.xlabel('X')
    plt.ylabel('T or Y')
    plt.legend();
X = np.arange(0, 10, 0.1).reshape(-1, 1)
T = np.sin(X) + 0.01 * (X ** 2)
# Collect every 5th sample as the test set.
test_rows = np.arange(0, X.shape[0], 5)
# All remaining samples are in the train set.
train_rows = np.setdiff1d(np.arange(X.shape[0]), test_rows)
Xtrain = X[train_rows, :]
Ttrain = T[train_rows, :]
Xtest = X[test_rows, :]
Ttest = T[test_rows, :]
print(f'{Xtrain.shape=} {Ttrain.shape=} {Xtest.shape=} {Ttest.shape=}')
n_inputs = X.shape[1]
n_outputs = T.shape[1]
nnet = NeuralNetwork(n_inputs, [10, 5], n_outputs)
nnet.set_weights_for_testing()
n_epochs = 10000
n_epochs_per_plot = 200
fig = plt.figure()
for reps in range(n_epochs // n_epochs_per_plot):
    plt.clf()
    nnet.train(Xtrain, Ttrain, Xtest, Ttest, n_epochs=n_epochs_per_plot, learning_rate=0.2)
    plot_data_and_model(nnet, Xtrain, Ttrain, Xtest, Ttest)
    ipd.clear_output(wait=True)
    ipd.display(fig)
ipd.clear_output(wait=True)
X = np.arange(-2, 2, 0.02).reshape(-1, 1)
T = np.sin(X) * np.sin(X * 10)
rows = np.arange(X.shape[0])
np.random.shuffle(rows)
ntrain = int(len(rows) * 0.7)
Xtrain = X[rows[:ntrain], :]
Ttrain = T[rows[:ntrain], :]
Xtest = X[rows[ntrain:], :]
Ttest = T[rows[ntrain:], :]
print(f'{Xtrain.shape=} {Ttrain.shape=} {Xtest.shape=} {Ttest.shape=}')
n_inputs = X.shape[1]
n_outputs = T.shape[1]
nnet = NeuralNetwork(n_inputs, [50, 10, 5], n_outputs)
nnet.set_weights_for_testing()
n_epochs = 80000
n_epochs_per_plot = 1000
fig = plt.figure(figsize=(10, 8))
for reps in range(n_epochs // n_epochs_per_plot):
    plt.clf()
    nnet.train(Xtrain, Ttrain, Xtest, Ttest, n_epochs=n_epochs_per_plot, learning_rate=0.05)
    plot_data_and_model(nnet, Xtrain, Ttrain, Xtest, Ttest)
    ipd.clear_output(wait=True)
    ipd.display(fig)
ipd.clear_output(wait=True)
Your results will not be the same, but your code should complete and make plots somewhat similar to these.
Apply your NeuralNetwork class to some concrete data!

Download the data from Calculate Concrete Strength at Kaggle. Read it into Python using the pandas.read_csv function. Assign the first 8 columns as inputs to X and the final column as target values to T. Make sure T is two-dimensional.
import pandas
# Read the csv file as a pandas.DataFrame
df = pandas.read_csv('concrete_data.csv')
Xd = df.iloc[:, range(8)]
X_names = Xd.columns
X = Xd.values
Td = df.iloc[:, 8:9]
T_names = Td.columns
T = Td.values
X.shape, X_names, T.shape, T_names
((1030, 8), Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer', 'Coarse Aggregate', 'Fine Aggregate', 'Age'], dtype='object'), (1030, 1), Index(['Strength'], dtype='object'))
Before training your neural networks, partition the data into training and testing partitions, as shown here.
rows = np.arange(X.shape[0])
np.random.shuffle(rows)
ntrain = int(0.9 * len(rows))
Xtrain = X[rows[:ntrain], :]
Ttrain = T[rows[:ntrain], :]
Xtest = X[rows[ntrain:], :]
Ttest = T[rows[ntrain:], :]
print(f'Concrete: {Xtrain.shape=}, {Ttrain.shape=}, {Xtest.shape=}, {Ttest.shape=}')
Concrete: Xtrain.shape=(927, 8), Ttrain.shape=(927, 1), Xtest.shape=(103, 8), Ttest.shape=(103, 1)
Use your NeuralNetwork class to train a model that predicts the concrete strength from the eight input values. Experiment with a variety of neural network structures (numbers of hidden layers and units per layer), including no hidden layers, as well as learning rates and numbers of epochs. Show results for at least three different network structures, three learning rates, and three numbers of epochs, for a total of at least 27 results. Show your results in a pandas DataFrame with columns ('Structure', 'Epochs', 'Learning Rate', 'Train RMSE', 'Test RMSE').
Try to find good values for the RMSE on testing data. Discuss your results, including how good you think the RMSE values are by considering the range of concrete strength values given in the data.
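The experiment loop can be sketched as three nested loops that fill a list of rows for the DataFrame. In this sketch, FakeNet is a hypothetical stand-in (a plain least-squares linear model) so the code runs on its own, and the synthetic Xtr/Ttr/Xte/Tte arrays stand in for the concrete data; substitute your NeuralNetwork class and the real Xtrain, Ttrain, Xtest, Ttest. The structures, learning rates, and epoch counts are illustrative choices, not required settings.

```python
import numpy as np
import pandas as pd

class FakeNet:
    """Hypothetical stand-in for the student's NeuralNetwork class."""
    def __init__(self, n_inputs, structure, n_outputs):
        self.structure = structure  # ignored by this linear-model stand-in
    def train(self, Xtrain, Ttrain, Xtest, Ttest, n_epochs, learning_rate):
        X1 = np.hstack((np.ones((Xtrain.shape[0], 1)), Xtrain))
        self.w, *_ = np.linalg.lstsq(X1, Ttrain, rcond=None)
        return self
    def use(self, X):
        return np.hstack((np.ones((X.shape[0], 1)), X)) @ self.w

def rmse(Y, T):
    return np.sqrt(np.mean((Y - T) ** 2))

# Synthetic stand-in data; use your concrete-data partitions instead.
rng = np.random.default_rng(0)
Xtr = rng.uniform(size=(50, 3)); Ttr = Xtr.sum(axis=1, keepdims=True)
Xte = rng.uniform(size=(10, 3)); Tte = Xte.sum(axis=1, keepdims=True)

results = []
for structure in ([], [10], [20, 10]):        # network structures
    for lr in (0.01, 0.05, 0.1):              # learning rates
        for epochs in (1000, 2000, 5000):     # numbers of epochs
            net = FakeNet(Xtr.shape[1], structure, Ttr.shape[1])
            net.train(Xtr, Ttr, Xte, Tte, n_epochs=epochs, learning_rate=lr)
            results.append([str(structure), epochs, lr,
                            rmse(net.use(Xtr), Ttr), rmse(net.use(Xte), Tte)])

df_results = pd.DataFrame(results, columns=('Structure', 'Epochs', 'Learning Rate',
                                            'Train RMSE', 'Test RMSE'))
print(df_results.head())
```

Sorting df_results by 'Test RMSE' is a convenient way to spot the best combinations when you discuss your results.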
Your notebook will be run and graded automatically. Test this grading process by first downloading A2grader.zip and unzipping A2grader.py from it. Run the code in the following cell to demonstrate an example grading session. The remaining 20 points will be based on your discussion of this assignment.
A different, but similar, grading script will be used to grade your checked-in notebook. It will include additional tests. You should design and perform additional tests on all of your functions to be sure they run correctly before checking in your notebook.
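One useful additional test is a finite-difference gradient check against your _gradients function. The sketch below shows the idea on a standalone linear-model mean squared error; the function mse and the arrays Xg, Tg, w here are illustrative, and adapting the check to your network's weight matrices is left to you.

```python
import numpy as np

def numerical_gradient(f, w, eps=1e-6):
    """Central-difference estimate of df/dw, element by element."""
    g = np.zeros_like(w)
    for i in range(w.size):
        w_plus = w.copy();  w_plus.flat[i] += eps
        w_minus = w.copy(); w_minus.flat[i] -= eps
        g.flat[i] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return g

# Example: gradient of mean squared error for a linear model Y = Xg @ w.
Xg = np.array([[1.0, 2.0], [3.0, 4.0]])
Tg = np.array([[1.0], [2.0]])
w = np.array([[0.1], [0.2]])

def mse(w):
    return np.mean((Tg - Xg @ w) ** 2)

analytic = -2 * Xg.T @ (Tg - Xg @ w) / Tg.size
print(np.allclose(numerical_gradient(mse, w), analytic, atol=1e-6))  # True
```

If the analytic and numerical gradients disagree for your network, the mismatch usually points directly at the layer whose backpropagation step is wrong.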
For the grading script to run correctly, you must first name this notebook as A2solution.ipynb
, and then save this notebook. Check in your A2solution.ipynb
notebook when you are ready.
%run -i A2grader.py
======================= Code Execution =======================

Extracting python code from notebook named A2solution.ipynb and storing in notebookcode.py
Removing all statements that are not function or class defs or import statements.

Testing
    n_inputs = 3
    n_hiddens = [2, 1]
    n_outputs = 2
    n_samples = 5
    X = np.arange(n_samples * n_inputs).reshape(n_samples, n_inputs) * 0.1
    T = np.hstack((X, X*2))
    nnet = NeuralNetwork(n_inputs, n_hiddens, n_outputs)
    nnet.set_weights_for_testing()
    # Set standardization variables so use() will run
    nnet.X_means = 0
    nnet.X_stds = 1
    nnet.T_means = 0
    nnet.T_stds = 1
    Y = nnet.use(X)

Weights set for testing by calling set_weights_for_testing()

--- 20/20 points. Returned correct value.

Testing
    n_inputs = 3
    n_hiddens = []  # NO HIDDEN LAYERS. SO THE NEURAL NET IS JUST A LINEAR MODEL.
    n_samples = 5
    X = np.arange(n_samples * n_inputs).reshape(n_samples, n_inputs) * 0.1
    T = np.hstack((X, X*2))
    n_outputs = T.shape[1]
    nnet = NeuralNetwork(n_inputs, n_hiddens, n_outputs)
    nnet.set_weights_for_testing()
    nnet.train(X, T, X, T, 1000, 0.01)
    Y = nnet.use(X)

Weights set for testing by calling set_weights_for_testing()

--- 20/20 points. Returned correct value.

Testing
    n_inputs = 3
    n_hiddens = [20, 20, 10, 10, 5]
    n_samples = 100
    X = np.arange(n_samples * n_inputs).reshape(n_samples, n_inputs) * 0.1
    T = np.log(X + 0.1)
    n_outputs = T.shape[1]
    Xtrain = X[np.arange(0, n_samples, 2), :]
    Ttrain = T[np.arange(0, n_samples, 2), :]
    Xtest = X[np.arange(1, n_samples, 2), :]
    Ttest = T[np.arange(1, n_samples, 2), :]
    def rmse(A, B):
        return np.sqrt(np.mean((A - B)**2))
    nnet = NeuralNetwork(n_inputs, n_hiddens, n_outputs)
    nnet.set_weights_for_testing()
    nnet.train(Xtrain, Ttrain, Xtest, Ttest, 6000, 0.01)
    Ytest = nnet.use(Xtest)
    err = rmse(Ytest, Ttest)
    print('RMSE', rmse(Ytest, Ttest))

Weights set for testing by calling set_weights_for_testing()
RMSE 0.0760118437192574

--- 40/40 points. Returned correct value.

======================================================================
A2 Execution Grade is 80 / 80
======================================================================

___ / 10 Correctly ran the required experiments with results in a pandas dataframe.
___ / 10 Provided a sufficient description (at least 10 sentences) of your experiments and results.

======================================================================
A2 Experiments and Discussion Grade is __ / 20
======================================================================

======================================================================
A2 FINAL GRADE is ___ / 100
======================================================================

Extra Credit: Apply your functions to a data set from the UCI Machine Learning Repository. Explain your steps and results in markdown cells.

A2 EXTRA CREDIT is 0 / 1
Apply your multilayer neural network code to a regression problem using data that you choose from the UCI Machine Learning Repository or the Kaggle Datasets. Pick a dataset that is listed as being appropriate for regression.